Search engine computer system, has natural language processor in search
component that translates search terms received by user interface into 
prioritized clustered tokens 

Patent Assignee: MICROSOFT CORP (MICT) 
Inventor: JAYANTI H; KONASEWICH P; STUMPF M D 
Patent Family (1 patents, 1 countries) 
Patent Application 

Number Kind Date Number Kind Date Update 

US 6775666 B1 20040810 US 2001867228 A 20010529 200459 B

Priority Applications (no., kind, date): US 2001867228 A 20010529

Patent Details 

Number Kind Lan Pg Dwg Filing Notes 

US 6775666 B1 EN 30 20

Alerting Abstract US B1

NOVELTY - A user interface receives user-defined search terms in 
searchable content database including an index from an information source, 
and displays the results. A search component searches the terms and 
retrieves the information containing the search term. A natural language 
processor in the search component translates the search terms into 
prioritized clustered tokens. 
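
The translation step described above can be sketched as follows. This is only a minimal illustration, not the patented method: the stop-word list, the word clusters, and the priority rule (clustered words first, then earlier words first) are all assumptions made for the example, since the abstract does not specify them.

```python
# Hypothetical stop words and word clusters; the abstract does not list any.
STOP_WORDS = {"how", "do", "i", "a", "the", "to", "my"}
WORD_CLUSTERS = {
    "car": {"car", "automobile", "auto"},
    "repair": {"repair", "fix", "mend"},
}


def prioritized_clustered_tokens(query):
    """Turn a free-form query into clusters of related words, best first."""
    words = [w.strip("?.,!").lower() for w in query.split()]
    tokens = [w for w in words if w and w not in STOP_WORDS]
    scored = []
    for position, token in enumerate(tokens):
        cluster = sorted(WORD_CLUSTERS.get(token, {token}))
        # Assumed priority rule: words that have a cluster outrank words
        # that do not, and earlier words outrank later ones.
        priority = (2 if token in WORD_CLUSTERS else 1, -position)
        scored.append((priority, cluster))
    scored.sort(key=lambda item: item[0], reverse=True)
    return [cluster for _, cluster in scored]
```

A search component could then try each cluster in order against the index, falling back to lower-priority clusters when the higher ones return nothing.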

DESCRIPTION - INDEPENDENT CLAIMS are also included for the following: 

1. method for searching and retrieving information; and 

2. computer readable medium storing information searching program.

USE - For index searching queries in computer system e.g. Internet. 

ADVANTAGE - Users can be directed to general or specific content within
an article outline and to related articles. Multiple query styles can be
searched to find relevant matches. User queries are analyzed to
determine their more and less important elements, and can be formed in an
ad-hoc, free-form manner. Treatment of hierarchical index data can be
combined with a natural language processor to provide more accurate and
detailed access to indexed content. Search results are retrieved in less
processing time.

DESCRIPTION OF DRAWINGS - The figure shows the architecture of the 
computer system. 

Title Terms/Index Terms/Additional Words: SEARCH; ENGINE; COMPUTER; SYSTEM;
NATURAL; LANGUAGE; PROCESSOR; COMPONENT; TRANSLATION; TERM; RECEIVE; USER;
INTERFACE; CLUSTER; TOKEN

Class Codes 

International Classification (Main) : G06F-017/30 
US Classification, Issued: 707005000, 707004000 

File Segment: EPI; 
DWPI Class: T01

Manual Codes (EPI/S-X) : T01-J16C3; T01-J16C6; T01-N03A2; T01-S03 




Original Publication Data by Authority 

Original Abstracts : 

A method and system for searching index databases allows a user to search
for specific information using high-level key words, questions, or
sentences. The system includes three main segments: a searchable content
database, a run time search component, and...

...exact match search, a natural language processor (NLP), and a full text
search. Indexes, prioritized search tokens, and word clusters are
combined to create a better search experience. A user's query is
processed into prioritized clustered tokens using the NLP, token priority
rules, and word clusters.
Claims:

...the search terms, the search component comprising a natural language
processor for translating the search terms into prioritized clustered
tokens.

Data corpus topics identifying method for information retrieval systems,
involves designating word combination as topic if determined
segment-level actual usage value of combination is greater than
expected usage value

Patent Assignee: PLIANT TECHNOLOGIES INC (PLIA-N)

Inventor: AKILESWAR S; CHILDERS R; KOTLAR D; ODOM P S
Patent Family (1 patents, 1 countries)

Patent Application 

Number Kind Date Number Kind Date Update 

US 20030167252 A1 20030904 US 200286026 A 20020226 200367 B

Priority Applications (no., kind, date): US 200286026 A 20020226

Patent Details 

Number Kind Lan Pg Dwg Filing Notes 

US 20030167252 A1 EN 15 8

Alerting Abstract US A1

NOVELTY - The method involves determining a segment-level actual
usage value for word combinations. A segment-level expected
usage value is computed for each word combination. A word
combination is designated as a topic if the segment-level actual
usage value of the combination is greater than the segment-level
expected usage value. One word in each word combination is
selected from a set of word lists.
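
The actual-versus-expected comparison can be illustrated with a short sketch. Several things here are assumptions, not details from the abstract: word combinations are limited to pairs, "actual usage" is taken to be the number of segments containing both words, and "expected usage" is the count predicted if the two words occurred independently. The function name `find_topics` and the `factor` parameter are hypothetical.

```python
from itertools import combinations


def find_topics(segments, word_list, factor=1.0):
    """Return word pairs whose actual segment count exceeds the expected one.

    segments:  list of text segments (e.g. sentences or paragraphs)
    word_list: set of domain words; each pair must contain at least one
    factor:    how much larger than expected the actual count must be
    """
    token_sets = [set(segment.lower().split()) for segment in segments]
    n_segments = len(token_sets)

    # Number of segments in which each single word occurs.
    segment_count = {}
    for tokens in token_sets:
        for word in tokens:
            segment_count[word] = segment_count.get(word, 0) + 1

    # Actual segment-level usage of each candidate pair.
    actual = {}
    for tokens in token_sets:
        for a, b in combinations(sorted(tokens), 2):
            if a in word_list or b in word_list:
                actual[(a, b)] = actual.get((a, b), 0) + 1

    topics = []
    for (a, b), count in actual.items():
        # Expected count under independence (an assumed model).
        expected = segment_count[a] * segment_count[b] / n_segments
        if count > factor * expected:
            topics.append((a, b))
    return sorted(topics)
```

Raising `factor` above 1.0 is one way to approximate the "substantially greater" wording that appears in the claim.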

DESCRIPTION - INDEPENDENT CLAIMS are also included for the following: 

1. a program storage device to identify topics in a data corpus; and

2. a method to display a list of topics associated with data items stored
in a database.

USE - Used for identifying topics in a data corpus of information 
retrieval systems. 

ADVANTAGE - The method provides a fast, low cost and automated 
classification of large amounts of data that is consistent with the 
semantic content of the data. The method provides a collection of topics 
that are used to guide information retrieval and the display of topic 
classifications during user query operations. 

DESCRIPTION OF DRAWINGS - The drawing shows a flow chart representing the 
method to identify topics in a corpus of data. 

Title Terms/Index Terms /Additional Words: DATA; CORPUS; TOPIC; IDENTIFY; 
METHOD; INFORMATION; RETRIEVAL; SYSTEM; DESIGNATED; WORD; COMBINATION; 
DETERMINE; SEGMENT; LEVEL; ACTUAL; VALUE; GREATER 

Class Codes 

International Classification (Main) : G06F-017/30 
US Classification, Issued: 707001000 

File Segment: EPI; 
DWPI Class: T01

Manual Codes (EPI/S-X) : T01-J05B; T01-S03 




Original Publication Data by Authority 
Original Abstracts: 

A technique to determine topics associated with, or classifications
for, a data corpus uses an initial domain-specific word list to identify
word combinations (one or more words) that appear in the data
corpus significantly more often than expected. Word combinations so
identified are selected as topics and associated with a user-specified
level of granularity. For example, topics may be associated
with each table entry, each image, each sentence, each paragraph, or an
entire file. Topics may...
Claims:

...claimed is: 1. A method to identify topics in a data corpus having
a plurality of segments, comprising: determining a segment-level
actual usage value for one or more word combinations; computing a
segment-level expected usage value for each of the one or more
word combinations; and designating a word combination as a
topic if the segment-level actual usage value of the word
combination is substantially greater than the segment-level
expected usage value of the word combination.

Electronic document file locating, ranking and marking method in internet,
involves ordering group of pie charts that represent collection of 
retrieved documents, hierarchically based on relevance of common key word 

Patent Assignee: ARAHA INC (ARAH-N) 
Inventor: HUSSAM A A 

Patent Family (1 patents, 1 countries) 
Patent Application 

Number Kind Date Number Kind Date Update 

US 20030050927 A1 20030313 US 2001318168 P 20010907 200352 B

US 2002127638 A 20020422

Priority Applications (no., kind, date): US 2001318168 P 20010907; US
2002127638 A 20020422



Patent Details 

Number Kind Lan Pg Dwg Filing Notes 

US 20030050927 A1 EN 54 48 Related to Provisional US 2001318168
Alerting Abstract US A1

NOVELTY - A subset of electronic document files having a selected 
common key word is retrieved from a universe of electronic document 
files. The key word is marked with a color highlighter. A group of 
perceptible pie charts corresponding to documents is provided. The pie 
charts that represent the collection of retrieved documents are ordered, 
hierarchically based on relevance of key word.

DESCRIPTION - INDEPENDENT CLAIMS are also included for the following:

1. method of providing abstract visual representations of a desired subset 
of data derived from a set of data; 

2. system of organizing an arbitrary collection of electronic documents; 

3. method of extracting and arranging a subset of electronic documents 
from a larger group of electronic documents; method of information 
gathering and encoding; 

4. method of organizing and sharing electronic document files among
multiple users; and 

5. method of information acquisition. 

USE - For locating, ranking and marking electronic document including 
hypertext markup language (HTML) documents in internet or intranet. 

ADVANTAGE - Semantic highlighting increases the rate at which people
locate and understand web-based documents, and allows non-static
metadata to be created by the author or other users of
the document. By using visual metadata in the form of pie charts, the user
can rapidly assess the relevance of documents
located by the search engine.

DESCRIPTION OF DRAWINGS - The figure shows a flow chart illustrating a
task analysis for locating and using a document.

Title Terms/Index Terms/Additional Words: ELECTRONIC; DOCUMENT; FILE;
LOCATE; RANK; MARK; METHOD; ORDER; GROUP; PIE; CHART; REPRESENT; COLLECT;
RETRIEVAL; HIERARCHY; BASED; RELEVANT; COMMON; KEY; WORD



Class Codes 

International Classification (Main) : G06F-017/30 
US Classification, Issued: 707005000 

File Segment: EPI; 
DWPI Class: T01

Manual Codes (EPI/S-X): T01-J05B1; T01-N03A2; T01-N03B2

Original Publication Data by Authority 



Original Abstracts: 

...within a universe of preexisting documents to extract a subset of
relevant documents is disclosed. The user selects search terms or
key words, and an application program performs a search of the
universe of documents, compiles a subset or collection of documents based
upon the search terms or keywords selected, and presents the
resulting collection of documents to the user. An abstract marker such
as a color highlighter, e.g...
Claims:

...each electronic document file within said subset of electronic document
file with a first abstract indicia; providing a group of second abstract
indicias each corresponding to a document in said subset of electronic
documents, said second abstract indicias being perceptible; and ordering
said group of abstract indicias hierarchically based upon the relevance
of said characteristic.

WPI ACC NO: 2002-758658/200282
XRPX Acc No: N2002-597251 

Text input method in virtual environment, involves monitoring positions of 
fingers by sensor glove that is calibrated with dynamic threshold values 
indicating occurrence of finger press 

Patent Assignee: UNIV NEW YORK STATE RES FOUND (UYNY) 
Inventor: EVANS F; SKIENA S; VARSHNEY A 
Patent Family (1 patents, 1 countries) 
Patent Application 

Number Kind Date Number Kind Date Update 

US 6407679 B1 20020618 US 199894910 P 19980731 200282 B

Priority Applications (no., kind, date): US 199894910 P 19980731; US
1999364433 A 19990730

Patent Details 



Alerting Abstract US B1

NOVELTY - A sensor glove is calibrated by establishing threshold values
when a user enters a sample sequence. The positions of the fingers are
monitored by the sensor glove. If a finger press has passed the threshold
value, feedback is provided to the user to indicate that a key is entered
and the key is stored. Key words are separated by recognizing spaces in the
stored sequence of keys and are matched with words in a dictionary, to
select the most probable word sequence.

USE - For entering text including Chinese symbols in a virtual 
environment created by computer system. 

ADVANTAGE - Recognizes fine finger movements such as key entry on a 
virtual key board using low cost, low resolution sensor glove. Eliminates 
the inherent noise in the finger movement data by using a low-pass Gaussian 
filter. 
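
The combination of Gaussian low-pass filtering and a calibrated press threshold can be sketched roughly as below. This is an illustrative sketch under stated assumptions: a single finger is modelled as a list of bend readings, the kernel radius and sigma are arbitrary, and the threshold is taken as the midpoint of the smoothed calibration sequence. None of these choices come from the abstract, and the function names are hypothetical.

```python
import math


def gaussian_smooth(samples, radius=2, sigma=1.0):
    """Low-pass filter a list of readings with a normalized Gaussian kernel."""
    offsets = range(-radius, radius + 1)
    weights = [math.exp(-(k * k) / (2.0 * sigma * sigma)) for k in offsets]
    total = sum(weights)
    kernel = [w / total for w in weights]
    last = len(samples) - 1
    smoothed = []
    for i in range(len(samples)):
        acc = 0.0
        for offset, weight in zip(offsets, kernel):
            j = min(max(i + offset, 0), last)  # clamp at the sequence edges
            acc += weight * samples[j]
        smoothed.append(acc)
    return smoothed


def calibrate_threshold(sample_sequence):
    """Assumed calibration rule: midpoint of the smoothed sample sequence."""
    smoothed = gaussian_smooth(sample_sequence)
    return (min(smoothed) + max(smoothed)) / 2.0


def detect_presses(samples, threshold):
    """Return the indices where the smoothed signal first rises above threshold."""
    presses = []
    above = False
    for i, value in enumerate(gaussian_smooth(samples)):
        if value > threshold and not above:
            presses.append(i)
        above = value > threshold
    return presses
```

With these settings a one-sample spike is flattened below the threshold, while a sustained bend of several samples still registers as one press.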

DESCRIPTION OF DRAWINGS - The figure shows a flowchart explaining finger 
press recognition procedure. 

Title Terms/Index Terms/Additional Words: TEXT; INPUT; METHOD; VIRTUAL; 

ENVIRONMENT; MONITOR; POSITION; FINGER; SENSE; GLOVE; CALIBRATE; DYNAMIC; 
THRESHOLD; VALUE; INDICATE; OCCUR; PRESS 

Class Codes 

International Classification (Main) : H03M-011/00 

US Classification, Issued: 341020000, 400475000, 400479200, 345702000, 
345811000, 345168000, 345773000 

File Segment: EPI; 
DWPI Class: U21

Manual Codes (EPI/S-X) : U21-A05D 




US 1999364433 A 19990730

Number Kind Lan Pg Dwg Filing Notes

US 6407679 B1 EN 52 9 Related to Provisional US 199894910



Original Publication Data by Authority



Claims:

...character for each finger press movement; calibrating the at least one
glove by establishing threshold values through a user inputting a
sample sequence, said threshold values indicating an occurrence of
finger press; monitoring the positions of the plurality of the fingers
the at least one sensor glove...

...matching the key words with one or more words in the dictionary,
generating all possible permutations of word sequences, and
selecting the most probable word sequence or partial sentence;
when a partial sentence is selected, generating feedback to the user
concerning the selected partial sentence; returning to said
monitoring step; and if the user indicates the end of a sentence,
erasing the stored sequence of keys, storing the last most probable
word sequence as a sentence, and returning to said monitoring step.

Automatic summary generation method for text documents, involves
calculating sum of scores of words and sentences extracted from document,
using which top-ranked sentences and key word list are generated and output

Patent Assignee: GUO Z L (GUOZ-I); INT BUSINESS MACHINES CORP (IBMC);
YANG L P (YANG-I)
Inventor: GUO Z L; YANG L P 
Patent Family (2 patents, 1 countries) 
Patent Application 

Number Kind Date Number Kind Date Update

US 20020052901 A1 20020502 US 2001943341 A 20010831 200252 B
US 7017114 B2 20060321 US 2001943341 A 20010831 200621 E

Priority Applications (no., kind, date): US 2001943341 A 20010831; CN
2000128668 A 20000907

Patent Details 

Number Kind Lan Pg Dwg Filing Notes 

US 20020052901 A1 EN 7 2

Alerting Abstract US A1

NOVELTY - A set of sentences and a set of words are extracted from the
document by different processes. A score is set for each word and each
sentence. The sum of the word scores and the sentence scores is
calculated and, if the sum has changed appreciably, the scores are
computed again. The top-ranked sentences by score are output as the
summary and the top-ranked words are output as the keyword list of
the document.
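
The iterate-until-the-sum-settles idea can be sketched as a simple mutual reinforcement loop. This is a simplified illustration, not the claimed method: it assumes a word's score is the sum of the scores of sentences containing it and vice versa, rescales each round so the largest score is 1.0, and leaves out the user profile, query, title and sentence-position terms that the claims mention. The function name and parameters are hypothetical.

```python
import re


def summarize(sentences, top_sentences=1, top_words=4, tol=1e-9, max_iter=100):
    """Score words and sentences against each other until the score sum settles."""
    tokenized = [set(re.findall(r"[a-z]+", s.lower())) for s in sentences]
    vocab = sorted(set().union(*tokenized))
    word_score = {w: 1.0 for w in vocab}
    sent_score = [1.0] * len(sentences)
    prev_total = sum(word_score.values()) + sum(sent_score)

    for _ in range(max_iter):
        # A word scores the sum of the scores of the sentences containing it.
        raw_words = {
            w: sum(sent_score[i] for i, toks in enumerate(tokenized) if w in toks)
            for w in vocab
        }
        peak = max(raw_words.values())
        word_score = {w: v / peak for w, v in raw_words.items()}

        # A sentence scores the sum of the scores of the words composing it.
        raw_sents = [sum(word_score[w] for w in sorted(toks)) for toks in tokenized]
        peak = max(raw_sents)
        sent_score = [v / peak for v in raw_sents]

        # Stop once the combined score sum no longer changes.
        total = sum(word_score.values()) + sum(sent_score)
        if abs(total - prev_total) < tol:
            break
        prev_total = total

    best = sorted(range(len(sentences)), key=lambda i: -sent_score[i])[:top_sentences]
    summary = [sentences[i] for i in sorted(best)]
    keywords = sorted(vocab, key=lambda w: (-word_score[w], w))[:top_words]
    return summary, keywords
```

Sentences that share many words with other high-scoring sentences rise to the top, and the words they share become the keyword list.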

DESCRIPTION - An INDEPENDENT CLAIM is included for a computer program 
product for automatically generating summaries for text documents. 

USE - For generating summary for text documents automatically. 

ADVANTAGE - A comprehensive summary including the important ideas of
the document is generated efficiently.

DESCRIPTION OF DRAWINGS - The figure shows a flowchart for automatic 
summary generation process. 

Title Terms/Index Terms /Additional Words: AUTOMATIC; SUMMARY; GENERATE; 
METHOD; TEXT; DOCUMENT; CALCULATE; SUM; SCORE; WORD; SENTENCE; EXTRACT; 
TOP; RANK; KEY; LIST; OUTPUT 

Class Codes 

International Classification (Main) : G06F-017/21 

Original Publication Data by Authority
Original Abstracts: 

A method and program product to generate summaries for text documents. A
user can also specify a query, topic, and terms that he/she is
interested in. This method determines the importance of each sentence
by using the linguistic salience of the word to the user profile,
the similarity among the word, the query and topic provided by a user and
the...

...word, this method computes the score for each sentence in the set of
sentences according to the score of words composing it and the position
of the sentence in a section and a paragraph...

...A method and program product to generate summaries for text documents. A
user can also specify a query, topic, and terms that he/she is
interested in. This method determines the importance of each sentence
by using the linguistic salience of the word to the user profile,
the similarity among the word, the query and topic provided by a user
and the sum of scores of the...

...After computing the score for each word, this method computes the score
for each sentence in the set of sentences according to the score of
words composing it and the position of the sentence in a section and a
paragraph.
Claims:

...1. An automatic method for generating summaries for text
documents, comprising steps of: generating a set of sentences for a set
of documents by document discourse analysis and a set of words by
morphologic process; initializing a score for each word in the set of
words and for each sentence in the set of sentences; computing the
score for each word in the set of words according to the score of
sentences containing it and the correlation degree between the word
and the user information; computing the score for each sentence in the set
of sentences according to the score of words composing it and the
position of the sentence in...

...a set of sentences for a set of documents by document discourse analysis
and a set of words by morphologic process; initializing a word score for
each word in the set of words, a sentence score for each sentence
in the set of sentences and a score sum; computing an aggregated word
score for said each word according to an aggregate of sentence
scores of sentences containing said each word and to a degree of
correlation between said each word and user related information; wherein
said aggregated word score (SCORE[w]) has a weighted (lambda)
relationship with each of said aggregated sentence score (SCORE[s]),
linguistic salience of said each word to a user profile (salience(w, user
summarization profile)), similarities among said each word, a query
and a provided topic (salience(w, user's query or topic)), similarities
among said each word and terms in titles of the documents
(salience(w, title words)), a ratio of an occurrence number for said each
word in a document to a total occurrence number for said each word in the
set of documents (FREQUENCY(w/d)/FREQUENCY(w/D)), and a ratio of a
number of documents including said each word to a total number of
documents in the set of documents (NUMBER(d, dw)/NUMBER(D)), of
the form

SCORE[w] = lambda1*salience(w, user summarization profile)
         + lambda2*salience(w, user's query or topic)
         + lambda3*Sigma(SCORE...

...w)/NUMBER(D); computing an aggregated sentence score for said each
sentence according to an aggregate of word scores composing said each
sentence and a respective sentence position in a section and a
paragraph; comparing an aggregate sum with said score sum, said
aggregate sum being a sum of aggregated word scores and aggregated
sentence scores; and if said aggregate sum is different than said score sum,
returning to the step of computing the aggregated word score;
otherwise, outputting top-ranked sentences according to sentence score as a
summary of the

Document classification for information retrieval system, involves
comparing created term and document vectors and storing document at
location relative to category node with term vector with preset relevance
ranking

Patent Assignee: SUN MICROSYSTEMS INC (SUNM)
Inventor: MOCKER J D; SNOW W A 
Patent Family (1 patents, 1 countries) 
Patent Application 

Number Kind Date Number Kind Date Update 

US 6185550 B1 20010206 US 1997874783 A 19970613 200152 B

Priority Applications (no., kind, date): US 1997874783 A 19970613

Patent Details 

Number Kind Lan Pg Dwg Filing Notes 

US 6185550 B1 EN 19 9

Alerting Abstract US B1

NOVELTY - Term vectors containing weights assigned to each of one or more 
common terms in the corresponding terms file are created and are compared 
with created document vectors of a document to provide relevance ranking 
between the terms file and document. The document is stored at a location 
corresponding to category node having a term vector which has a relevance 
ranking that matches a selected criteria. 
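
The vector comparison can be sketched as follows. Cosine similarity over raw term counts is an assumption made for this example; the abstract says only that the term vectors and document vectors are compared to give a relevance ranking. The category names, the `min_score` cutoff, and the function names are hypothetical.

```python
import math
from collections import Counter


def to_vector(text):
    """Weight each term by how often it occurs in the text."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two term-weight vectors."""
    dot = sum(weight * b[term] for term, weight in a.items() if term in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def classify(document, category_terms, min_score=0.1):
    """Return the best-matching category name, or None if nothing is relevant.

    category_terms maps a category name to the defining terms of its terms file.
    """
    doc_vector = to_vector(document)
    ranked = sorted(
        ((cosine(to_vector(terms), doc_vector), name)
         for name, terms in category_terms.items()),
        reverse=True,
    )
    best_score, best_name = ranked[0]
    # The selection criterion here is a simple minimum relevance score.
    return best_name if best_score >= min_score else None
```

The returned name would then determine the directory, and hence the category node, under which the document is stored.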

DESCRIPTION - A class hierarchy is created by providing several
category nodes, for each of which a terms file is created. A class
hierarchy having a root category node within a tree data structure is
initialized and displayed. User-selected commands for manipulating the
class hierarchy are entered. A category command is processed in response
to a user-selected command having a predefined state, which causes the
class hierarchy to contain several category nodes. Each category node
includes a category name, node type, node ID, parent ID and link ID, which
are all stored in the database. When the node type is a predefined type, a
new category node is allowed to be added to the selected category nodes;
otherwise a new category node is prevented from being added to the
category nodes. The node ID defines the unique directory. The parent ID
indicates the node ID of a parent category node. The link ID indicates
the node IDs of several category nodes when the node type is of a
predetermined type. INDEPENDENT CLAIMS are also included for the following:

1. Document classifying; and

2. Document classification program.

USE - For classification of documents within defined categories using 
class hierarchy in information retrieval system. 

ADVANTAGE - Since automatic document classification within user-defined
categories is provided, the user can interactively search for
documents according to search terms defined within user-defined categories.
Since documents are ranked according to relevance and a user-specified
number of the most relevant documents are returned, multiple
users can access the document via network.

DESCRIPTION OF DRAWINGS - The figure shows the flowchart of the main
procedure utilized in creation of the document directory hierarchy.

Original Publication Data by Authority 



Original Abstracts: 

A method for classifying a document based on content within a class
hierarchy. The class hierarchy comprises a plurality of category nodes
stored within a tree data structure. Each of the...

...includes a category name corresponding to a unique directory and a
category definition comprising a set of defining terms. The class
hierarchy is searched to determine appropriate categories for
classification of the document. The document is then...
Claims:

A method for classifying a document based on content within a class
hierarchy, the method comprising: initializing the class hierarchy, the
class hierarchy having a root category node within a tree data
structure, the root category node having a user-defined category
name; displaying the class hierarchy; accepting a user-selected command
for manipulating the class hierarchy; processing a category command in
response to the user-selected command having a first predefined state,
causing the class hierarchy to contain a plurality of category nodes,
said processing the category command further comprising: storing a
category name in one of the plurality of category...

...plurality of category nodes when the node type is of a predefined
type; creating a class hierarchy by providing a plurality of category
nodes stored in a tree data structure within a memory, each of said
plurality of category nodes having a category name corresponding to a
unique directory and a set of defining terms; creating a plurality of
terms files, each of said plurality of terms files corresponding to one of
said plurality of category nodes and including a corresponding set of
defining terms and one or more document fragments stored under said one
of said plurality of category nodes, said set of defining terms
including a term corresponding to one of said plurality of category nodes
and said one or more document fragments including a reference to one or
more documents and indexing information indicating contiguous
multi-term portions of said documents to be extracted during indexing, said
set of defining terms and said document fragments together providing a
definition of files to be contained in said unique directory referenced by
said one of said plurality of category nodes; creating one or more term
vectors for each of said terms files...

...vector containing a weight assigned to the terms of the document
according to frequency of occurrence; providing a relevance ranking
between said terms files and said document by comparing said document
vector with said one or more term vectors; and storing said document within
said document directory hierarchy at a location corresponding to a
category node having a term vector which has a relevance ranking that
matches a selected criteria.

Alerting Abstract US A

NOVELTY - A user selected portion (304) of the source document (300) is
received and from which terms are selected. For a selected term, hypertext 
link to target document (310) is automatically created with relevance to 
the term. A menu of hypertext links to the target documents is created with 
selected term as link anchor. 
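
A rough sketch of this link-generation step is given below. It assumes that terms are chosen by raw frequency within the selected portion and that a small lookup table maps terms to topics and target URLs. The stop-word list, the knowledge base entries, the example URLs and the function name are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical stop words and knowledge base (term -> topic, target URL).
STOP_WORDS = {"the", "a", "an", "at", "can", "of", "and", "in"}
KNOWLEDGE_BASE = {
    "jaguar": ("Big cats", "https://example.com/big-cats"),
    "rivers": ("Waterways", "https://example.com/waterways"),
}


def contextual_links(selected_text, max_links=3):
    """Build a menu of links anchored on the most frequent known terms."""
    words = re.findall(r"[a-z]+", selected_text.lower())
    counts = Counter(w for w in words if w not in STOP_WORDS)
    menu = []
    # most_common() lists the most frequent terms first.
    for term, _ in counts.most_common():
        if term in KNOWLEDGE_BASE:
            topic, url = KNOWLEDGE_BASE[term]
            menu.append({"anchor": term, "topic": topic, "url": url})
        if len(menu) == max_links:
            break
    return menu
```

Each menu entry keeps the selected term as the link anchor, matching the description of the menu of hypertext links.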

DESCRIPTION - An INDEPENDENT CLAIM is also included for a
computer-implemented system for dynamically generating content-dependent
hypertext links.

USE - For dynamic generation of hypertext links from source document to 
target document in world wide web sites. 

ADVANTAGE - By generating hypertext links dynamically, the document 
appropriate to user interests is customized effectively. The user has full 
control over the semantic content used in defining links. The static 
generation of tags permits links in the source document to be current at 
all times while giving the publisher editorial control over tags and 
keywords. The publisher is able to update the tags or links of the document 
effectively using automatic generation of links.

DESCRIPTION OF DRAWINGS - The figure shows an illustration of the navigation
paradigm in the hyperlink generating method.

300 Source document 

304 User selected portion 

310 Target document 

Original Abstracts:

A system, method, and software product create contextual hypertext links 
relevant to a user selected portion of a source document. The contextual 
links enable the user to dynamically associate... 

...and the target document when the source document was created. The method 
includes selecting terms relevant to the user selected portion by 
linguistic analysis which selects the most frequently occurring terms. 
From the selected terms target documents relevant to the selected terms are 
identified. The target documents are selected by identifying topics 
that are associated with, or described by, the selected terms. Contextual
links are created between... 

...the documents in the contextual links. The system includes a knowledge 
base of topics, including hierarchical relations between topics, and 
associations of topics and terms. A document collection includes
documents and references to documents, and URL or other addressing 
information for the documents... 

Alerting Abstract WO A1

NOVELTY - The method presents thematic capsule overviews (206) of
documents to users. The overviews are derived for the entire document and
show the core content of an average-length article in a more accurate and
representative manner than conventional techniques. Overviews are
delivered in a variety of presentation modes and allow users to quickly
get the sense of what a document is about and decide whether they want to
read it in more detail.

USE - For reviewing documents and presenting them in a manner that allows
the user to quickly ascertain their contents.

ADVANTAGE - User can decide whether he or she desires to be actively
involved in the presentation.

DESCRIPTION OF DRAWINGS - The drawing shows a simple flow chart
illustrating a method for the dynamic presentation of several documents.

206 displaying capsule overviews



Title Terms/Index Terms/Additional Words: METHOD; DYNAMIC; PRESENT; CONTENT;
DOCUMENT; DISPLAY; RECEIVE; CAPSULE; CORRESPOND

Class Codes 

International Classification (Main) : G06F-017/21, G06F-017/30, G06F-007/00 
US Classification, Issued: 707001000, 707005000, 707003000, 715526000,
707005000, 707001000, 707003000, 707104100, 707501100, 707530000,
707531000, 704009000, 707005000, 707001000, 707003000, 707102000,
707104100, 707501100, 707530000, 707531000, 707513000, 704009000,
707005000, 707001000, 707003000, 707100000, 707104100, 715500100,
715513000, 715902000, 715907000

File Segment: EPI; 
DWPI Class: T01

Manual Codes (EPI/S-X) : T01-J05B 



Original Titles: 

...OVERVIEWS CORRESPONDING TO THE PLURALITY OF DOCUMENTS, RESOLVING
CO-REFERENTIALITY RELATED TO FREQUENCY WITHIN DOCUMENT, DETERMINING
TOPIC STAMPS FOR EACH DOCUMENT SEGMENTS...

Original Publication Data by Authority 



Original Abstracts : 

...particular interest to the user. In a preferred embodiment, the capsule 
overviews include a containment hierarchy which relates the different
information levels in a document together, and which includes a collection 
of highly salient topic stamps embedded in layers of progressively richer 
and more informative contextualized text fragments. The novel 
presentation metaphors which the invention utilizes are based on notions... 

...particular interest to the user. In a preferred embodiment, the capsule 
overviews include a containment hierarchy which relates the different 
information levels in a document together, and which includes a collection 
of highly salient topic stamps embedded in layers of progressively 
richer and more informative contextualized text fragments. The novel
presentation metaphors which the invention utilizes are based on notions 
of temporal typography, in particular for exploiting the interactions 
between form and content... 

...particular interest to the user. In a preferred embodiment, the capsule 
overviews include a containment hierarchy which relates the different 
information levels in a document together, and which includes a collection 
of highly salient topic stamps embedded in layers of progressively richer 
and more informative contextualized text fragments. The novel presentation
metaphors which the invention utilizes are based on notions of temporal
typography, in particular for exploiting the interactions between form and
content.

...particular interest to the user. In a preferred embodiment, the capsule 
overviews include a containment hierarchy which relates the different 
information levels in a document together, and which includes a collection 
of highly salient topic stamps embedded in layers of progressively richer 
and more informative contextualized text fragments. The novel presentation
metaphors which the invention utilizes... 

...notions of temporal typography, in particular for exploiting the 
interactions between form and content. 



...particular interest to the user. In a preferred embodiment, the capsule
overviews include a containment hierarchy which relates the different 
information levels in a document together, and which includes a collection 
of highly salient topic stamps embedded in layers of progressively richer 
and more informative contextualized text fragments. The novel presentation 
metaphors which the invention utilizes... 

...between form and content. 

A method...

...read it in more detail. In a preferred embodiment, the capsule overviews 
include a containment hierarchy which relates the different information 
levels in a document together, and which includes a collection of highly 
salient topic stamps embedded in layers of progressively richer and more 
informative contextualized text fragments. The novel presentation metaphors
which the invention utilizes.



...form and content. 
A method for...

...read it in more detail. In a preferred embodiment, the capsule overviews 
include a containment hierarchy which relates the different information 
levels in a document together, and which includes a collection of highly 
salient topic stamps embedded in layers of progressively richer and more 
informative contextualized text fragments. The novel presentation metaphors 
which the invention utilizes 
Claims : 

...document segments; 3) resolving co-referentiality among the discourse
referents within, and across, the document segments, wherein the
resolving step comprises linking the discourse referents by
co-referentiality with each other to assess a frequency...

...prominence; 4) calculating salience values for the discourse referents
based upon the resolving step; 5) determining topic stamps for the
document segments based upon discourse salience values of the
associated discourse referents; and 6) providing a capsule overview of the
document constructed from the topic stamps; and c) dynamically delivering
document content as encapsulated within the plurality of capsule
overviews...

...What is claimed is: 1. A computer readable medium containing
programming instructions for dynamically presenting the contents of...

...document segments; 3) resolving co-referentiality among the discourse
referents within, and across, the document segments, wherein the
resolving step comprises linking the discourse referents by
co-referentiality with each other to assess a frequency...

...4) calculating discourse salience values for the discourse referents
based upon the resolving step; 5) determining topic stamps for each of
the document segments based upon discourse salience values of the
associated discourse referents; and 6) providing a capsule overview of the
document.
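
Steps 4) to 6) of the claim excerpts above can be pictured with a small sketch. It assumes that co-reference resolution (step 3) has already been done, so that every mention carries the identifier of the discourse referent it points to, and it assumes a hypothetical numeric prominence weight per mention. How the patent actually computes salience is not visible in these excerpts, so the summation below is a stand-in.

```python
from collections import defaultdict


def topic_stamps(mentions, per_segment=2):
    """mentions: iterable of (segment_index, referent_id, prominence_weight).

    Mentions sharing a referent_id are treated as co-referential. The
    salience of a referent within a segment is taken here to be the sum
    of the weights of its mentions (an assumption of this sketch).
    """
    salience = defaultdict(lambda: defaultdict(float))
    for segment, referent, weight in mentions:
        salience[segment][referent] += weight

    stamps = {}
    for segment in sorted(salience):
        # Highest salience first; ties are broken alphabetically.
        ranked = sorted(salience[segment].items(),
                        key=lambda item: (-item[1], item[0]))
        stamps[segment] = [referent for referent, _ in ranked[:per_segment]]
    return stamps


def capsule_overview(stamps):
    """Build one overview line per segment from its topic stamps."""
    return [" / ".join(referents) for _, referents in sorted(stamps.items())]
```

A fuller implementation would attach contextualized text fragments to each stamp, as the abstracts describe; the sketch stops at the stamp lists.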
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Alerting Abstract WO Al 

NOVELTY - A collection selection query including a set of predetermined
search terms is obtained. An inverse collection frequency is determined for
each search term with respect to each database and the set of databases. A
document frequency is determined for each search term with respect to each
database. A ranking value is determined for each database based on a sum of
the products of the inverse collection frequencies for the search terms and
the document frequencies for respective search terms. A subset of the set
of databases is selected based on predetermined criteria dependent on the
ranking value for each database.

DESCRIPTION - The method involves: a) obtaining a collection selection
query including a set of predetermined search terms, b) determining an
inverse collection frequency for each search term with respect to each
database and the set of databases, and determining a document frequency for
each search term with respect to each database, c) determining a ranking
value for each database based on a sum of the products of the inverse
collection frequencies for the search terms and the document frequencies for
respective search terms, d) selecting a subset of the set of databases
based on predetermined criteria dependent on the ranking value for each
database, and e) selectively repeating portions of the steps (b) through (d)
with respect to each search term for each iteration of the method.
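
The ranking in step c) is a sum, over the search terms, of inverse collection frequency times document frequency. A minimal sketch of steps b) to d) follows. The exact inverse collection frequency formula is not given in this record, so the logarithmic form used here is an assumption borrowed from common IDF-style weighting, and the "predetermined criteria" are reduced to a simple top-N cut.

```python
import math


def inverse_collection_frequency(term, doc_freqs):
    """Assumed form: log(N / n_t) + 1, where N is the number of databases
    and n_t the number of databases containing the term."""
    containing = sum(1 for df in doc_freqs.values() if df.get(term, 0) > 0)
    if containing == 0:
        return 0.0
    return math.log(len(doc_freqs) / containing) + 1.0


def rank_databases(terms, doc_freqs):
    """Ranking value per database: sum over terms of ICF(t) * DF(t, db)."""
    icf = {t: inverse_collection_frequency(t, doc_freqs) for t in terms}
    return {db: sum(icf[t] * df.get(t, 0) for t in terms)
            for db, df in doc_freqs.items()}


def select_databases(terms, doc_freqs, top_n=2):
    """Select the top_n databases by ranking value (ties by name)."""
    ranks = rank_databases(terms, doc_freqs)
    return sorted(ranks, key=lambda db: (-ranks[db], db))[:top_n]
```

Here `doc_freqs` maps each database name to a dictionary of per-term document frequencies. Because the ranking depends only on those statistics and on the query terms, repeating the steps with the same inputs gives the same relative ranking.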

USE - The method is used to permit iterative performance of collection 
selection relative to a set of databases, where each database includes 
several documents, to obtain consistent relative-ranking collection 
selection results each iteration. 

ADVANTAGE - Improves selection of most relevant collections for searching
based on an ad hoc query.

DESCRIPTION OF DRAWINGS - The drawing shows a flow diagram illustrating 
the operation in supporting a meta-index database construction and user 
search. 
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Alerting Abstract ...NOVELTY - A collection selection query including a 
set of predetermined search terms is obtained. An inverse collection frequency
is determined for each search term with respect to... 

DESCRIPTION - The method involves: a) obtaining a collection selection
query including a set of predetermined search terms, b) determining an inverse
collection frequency for each search term with respect to each database...

...on predetermined criteria dependent on the ranking value for each database,
and e) selectively repeating portions of the steps (b) through (d) with
respect to each search term for each iteration of the method...
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Original Abstracts: 

...by exclusion of predetermined context-free single-word terms and
punctuation; (b) applying each such selected term against a meta-index
descriptive of the document collections; (c) determining cumulative
rankings for the document collections relative to each such selected
term normalized against the plurality of document collections; and (d)
selecting a set of the document collections having the highest relative
cumulative rankings...

...repetitive steps of determining an inverse collection frequency and a
document frequency for each database; determining a ranking value for 
each database; selecting a subset of the set of databases based on 
predetermined criteria dependant on the ranking value for... 

...file that describes the query significant search terms that are present 
in a particular document collection correlated to normalized document 
usage frequencies of such terms within the documents of each document 
collection. By access to the meta-information data file, a relevance
score for each of the document collections... 

...determined. The method then returns an identification of the subset of
the plurality of document collections having the highest relevance
scores for use in evaluating the predetermined query. The
meta-information data file may be constructed to include document
normalized term frequencies and other contextual information that can be...
Claims : 

...and information about documents in the corresponding ones of the
document collections; parsing said input query text to select
single-word terms and multiple-word phrase terms from said query text by...

...word terms and punctuation; applying each such selected term against the
meta-index values in said meta-index to determine correlation
between the selected terms and the meta-index values; determining
cumulative rankings for said document collections based upon said
correlation relative to each such selected term normalized against said
plurality of document collections; and selecting a subset of said
document collections having the highest relative cumulative rankings
whereby said subset of...

...collections to search using said input query text, searching each of said
subset of document collections with said input query text to select
documents correlating to said query text.
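
The parsing step in these claim excerpts selects single-word terms and multiple-word phrase terms by excluding context-free words and punctuation. One plausible reading, sketched below, treats punctuation and stop words as break points and takes each run of consecutive remaining words as a phrase. The stop-word list and the function name are illustrative assumptions, not taken from the patent.

```python
import re

# Illustrative "context-free" single-word terms; the real list is not given.
STOP_WORDS = {"a", "an", "the", "of", "in", "for", "and", "to", "on"}


def parse_query(query):
    """Return (single_word_terms, phrase_terms) in order of appearance."""
    singles, phrases = [], []
    # Punctuation splits the query into chunks; stop words split chunks
    # into runs of consecutive content words.
    for chunk in re.split(r"[^\w\s]+", query.lower()):
        run = []
        for word in chunk.split() + [None]:  # None flushes the final run
            if word is not None and word not in STOP_WORDS:
                run.append(word)
                continue
            for w in run:
                if w not in singles:
                    singles.append(w)
            if len(run) > 1:
                phrase = " ".join(run)
                if phrase not in phrases:
                    phrases.append(phrase)
            run = []
    return singles, phrases
```

Each returned term could then be looked up in the meta-index to accumulate a per-collection ranking.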



...each iteration, said method comprising the steps of: a) obtaining a
collection selection query including a set of predetermined search
terms; b) determining an inverse collection frequency for each member of
said set of predetermined search terms with respect to each said
database and said set of databases, and determining a document frequency
for each member of said set of predetermined search terms with
respect to each said database; c) determining a ranking value for each said
database based on a sum of the products of said inverse collection
frequencies for said set of predetermined search terms and said
document frequencies for respective members of said set of search terms;
d) selecting a subset of said set of databases based on predetermined
criteria dependant on said ranking value for each said database; and e)
selectively repeating portions of said steps (b) through (d) with
respect to each member of said set of predetermined search terms for
each iteration of said method.
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Alerting Abstract 
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The automatic recognition method selects a block of data from the 
incoming data stream. It searches the block for indications of the presence 
or absence of the particular language in the data block. The data block is 
tested for a variety of known languages, the languages being applied in a 
predetermined order. 

The order of application of the languages is by increasing probability of 
an error of recognition occurring. The testing procedure looks for a 
characteristic element in the language, or a signature. 

Alternatively key words or particular synchronisation characters are 
sought. The width of the observation window is varied to suit the language 
being tested. From these tests an interpretation module is selected for the
numeric data stream. 
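
The recognition loop described above tests a data block against known languages in a fixed order, each with its own observation window and signature. A rough sketch follows. The three recognizers, their order, their window widths and their patterns are illustrative assumptions only; the abstract does not name any particular language.

```python
import re

# Hypothetical recognizers, assumed to be listed in order of increasing
# probability of a recognition error:
# (language name, observation window in bytes, signature pattern).
RECOGNIZERS = [
    ("PostScript", 4, re.compile(rb"^%!PS")),
    ("HP-GL", 64, re.compile(rb"(^|;)\s*IN\s*;")),
    ("plain-text", 256, re.compile(rb"^[\x09\x0a\x0d\x20-\x7e]+$")),
]


def recognize_language(block, recognizers=RECOGNIZERS):
    """Return the first language whose signature is found in its window."""
    for name, width, signature in recognizers:
        if signature.search(block[:width]):
            return name
    return None  # no known language recognized
```

The returned name would then be used to pick the interpreter module for the rest of the stream.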

USE/ADVANTAGE - Printing or display of drawings from tracer data.
Accurate and fast recognition of language used for transfer of numeric
data.



Title Terms/Index Terms /Additional Words: AUTOMATIC; RECOGNISE; LANGUAGE; 
CARRY; NUMERIC; DATA; APPLY; SEQUENCE; PART; STREAM; CHARACTERISTIC; 
PATTERN; SIGNATURE; DECIDE; TRANSLATION; MODULE; DRAWINGS; PRINTING; 
DISPLAY; TRACER 

Class Codes 

International Classification (Main): G06F-015/38, G06F-017/40, G06F-003/12
(Additional/Secondary): B41J-029/38, G06F-013/00, G06F-003/14, G06F-009/45,
G06K-009/62
International Classification (+ Attributes)
IPC + Level Value Position Status Version
G06F-0017/27 A I R 20060101
G06F-0003/12 A I R 20060101
G06F-0017/27 C I R 20060101
G06F-0003/12 C I R 20060101

US Classification, Issued: 382229000, 382181000, 395112000, 395114000,
395500000, 395707000

File Segment: EngPI; EPI; 
DWPI Class: T01; P75; P86

Manual Codes (EPI/S-X): T01-C05A; T01-C05B; T01-D02; T01-J11; T01-S

...applies sequence of languages to part of data stream to look for 
characteristic patterns or signatures to decide translator module 



Original Publication Data by Authority 



Original Abstracts: 

...special synchronization characters or keywords, and then for languages
using mnemonics made up of a determined number of significant characters.
The method is used for automatically selecting an interpreter module
for decoding the received data, in particular the data and then for
languages using mnemonics made up of a determined number of significant
characters. The method is used for automatically selecting an interpreter
module for...
Claims : 

...in that recognition is performed by searching for a plurality of
known languages in an order of increasing probability of recognition
error proceeding, for each language, with a search in the data... block that
tend to indicate the presence or the absence of a language; said step of
seeking including the sub-steps of sequentially testing for a plurality
of known languages according to a predetermined sequential arrangement...

...and proceeding, for each one of said known languages, with a search in
said block for at least one language element characteristic of said
one of said known languages.
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Alerting Abstract US Al 

NOVELTY - The system has a string matching unit to find occurrences in a
document of key phrases associated with a selected subset of
hypertext links (101, 102, 103, 104, 105). A link installation unit inserts
the hypertext links associated with the occurrences into the document
submitted for hypertext link installation. An output unit returns a
document corresponding to the document including the inserted hypertext
links.
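
The string matching and link installation units can be pictured with a short sketch: every occurrence of a stored key phrase in the submitted text is wrapped in an anchor pointing at the associated link. The HTML output format, the case-insensitive matching, the longest-phrase-first rule and the function name are assumptions made for this illustration, not details from the patent.

```python
import re
from html import escape


def install_links(text, links):
    """links: mapping of key phrase -> URL. Returns text with anchors added.

    The input is treated as plain text; one regex pass is used so that an
    already inserted anchor is never matched again.
    """
    if not links:
        return text
    # Longest phrases first, so "search engine" wins over "search".
    phrases = sorted(links, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(p) for p in phrases) + r")\b",
        re.IGNORECASE,
    )
    lookup = {phrase.lower(): url for phrase, url in links.items()}

    def replace(match):
        url = lookup[match.group(1).lower()]
        return '<a href="{}">{}</a>'.format(escape(url, quote=True),
                                            match.group(1))

    return pattern.sub(replace, text)
```

The returned string corresponds to the document handed back by the output unit.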

DESCRIPTION - INDEPENDENT CLAIMS are also included for the following:

1. a method for installing hypertext links in a document; and

2. a system for retrieving hypertext links.



USE - Used for installing a hypertext link in a document for access on 
the Internet through World Wide Web. 

ADVANTAGE - The link installation unit provides the link installation 
service, thereby automatically installing hypertext links within 
information submitted to the service by the hypertext authors. 

DESCRIPTION OF DRAWINGS - The drawing shows an example initial web page 
seen by a visitor using a web browser to access the online version of the 
service . 

101, 102, 103, 104, 105 Hypertext links
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Alerting Abstract ...NOVELTY - The system has a string matching unit to 
find occurrences in a document of key phrases associated with a 
selected subset of hypertext links (101, 102, 103, 104, 105). A link 
installation unit inserts the... 

...a method for installing hypertext links in a document; a system for
retrieving hypertext links... 

Original Publication Data by Authority 



Claims : 

...is claimed is: 1. A system for installing hypertext links in a
document, comprising a) hierarchical database means containing hypertext
links, wherein each of said hypertext links is associated with a set of
key phrases; b) link selection means for selecting a subset of said
hypertext links, thereby forming a selected subset of hypertext links; c...

...hypertext link installation; d) string matching means for finding
occurrences in said submitted document of key phrases associated with
said selected subset of hypertext links; e) link installation means for
inserting into said submitted document hypertext...

...a submitter; matching means for finding an occurrence of at least one of
the stored key phrases in the submitted text, thereby determining a
set of one or more found key phrases; link retrieval means for retrieving
from the database...
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Alerting Abstract WO Al 

NOVELTY - An interactive document retrieval system where the query
processor performs the step of analyzing using a hybrid method based on
linguistic and mathematical approaches for an automatic text
categorization. The captured document is analyzed to determine their text
patterns which are commonly occurring and searchable phrases, pairing of
words, with each pairing comprising two searchable words, where one word in
each pairing occurs frequently within the document and the other word in
each pairing occurs near the one word frequently within the document.
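
The word pairings described here (one word that is frequent in the document, plus a second word that frequently occurs near it) can be approximated as follows. The stop-word list, the window size, the number of frequent words and the minimum co-occurrence count are all hypothetical parameters of this sketch.

```python
from collections import Counter
import re

STOP_WORDS = {"the", "a", "an", "of", "and", "to", "in", "is"}  # illustrative


def word_pairs(text, top_words=2, window=1, min_count=2):
    """Return (frequent_word, nearby_word) pairs seen at least min_count times."""
    words = [w for w in re.findall(r"[a-z]+", text.lower())
             if w not in STOP_WORDS]
    frequent = {w for w, _ in Counter(words).most_common(top_words)}

    pairs = Counter()
    for i, word in enumerate(words):
        if word not in frequent:
            continue
        lo, hi = max(0, i - window), min(len(words), i + window + 1)
        for j in range(lo, hi):
            if j != i and words[j] != word:
                pairs[(word, words[j])] += 1
    return [pair for pair, count in pairs.most_common() if count >= min_count]
```

The resulting list of pairs is one possible form for a document's word pattern, to be compared later against the patterns stored in the knowledge base.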

DESCRIPTION - The knowledge base is initially constructed by analyzing
indexed documents to which topics have previously been assigned, thereby
determining the indexed documents' word patterns, and then storing in
the knowledge database these word patterns for the indexed documents and
the topics assigned to these documents, and then relating the word
pattern of an indexed document to the topic assigned to that same indexed
document.

INDEPENDENT CLAIMS are also included for the following: 

1. An interactive method of searching for and retrieving documents after 
receiving a search query from a requestor. 

2. A computer program. 

3. A mobile computing and/or telecommunications device comprising a
graphical user interface capable of applying the WAP standard for
accessing documents from the Internet and/or any corporate network.

USE - High-speed access search engines for information retrieval 
systems used in the Internet and/or corporate intranet domains for 
retrieving accessible documents using automatic text categorization 
techniques to support the presentation of search query results within 
high-speed network environments. 

ADVANTAGE - Provides an integrated, automatic and open information
retrieval system, comprising a hybrid method based on linguistic and
mathematical approach for an automatic text categorization.

Enables the possibility of meeting the requirements of all Internet users
by means of the novel Internet archive. Provides desired information in a
quick, simple and accurate manner that allows significant advantages with
regard to data management within individual companies.

DESCRIPTION OF DRAWINGS - The drawing is an overview block diagram of an 
indexed extensible, interactive retrieval system . 
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Internet high-speed access search engines for information retrieval 
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CATEGORY BASED, EXTENSIBLE AND INTERACTIVE SYSTEM FOR DOCUMENT RETRIEVAL 




Alerting Abstract ...NOVELTY - An interactive document retrieval
system where the query processor performs the step of analyzing using a
hybrid method based on...

...mathematical approaches for an automatic text categorization. The
captured document is analyzed to determine their text patterns which
are commonly occurring and searchable phrases, pairing of words, with each
pairing comprising two...

DESCRIPTION - The knowledge base is initially constructed by analyzing
indexed documents to which topics have previously been assigned, thereby
determining the indexed documents word patterns, and then storing in
the knowledge database these word patterns for the indexed documents
and the topics assigned to these documents, and then relating the word
pattern of an indexed document to the topic assigned to that same indexed
document...

...USE - High-speed access search engines for information retrieval
systems used in the Internet and/or corporate intranet domains for
retrieving accessible documents using automatic text categorization
techniques to support the presentation of search query results...

...ADVANTAGE - Provides an integrated, automatic and open information
retrieval system, comprising a hybrid method based on linguistic and
mathematical approach for an automatic text categorization...

...DESCRIPTION OF DRAWINGS - The drawing is an overview block diagram of an
indexed extensible, interactive retrieval system.

Original Publication Data by Authority 

Original Abstracts: 

An integrated, automatic and open information retrieval system (100)
comprises a hybrid method based on linguistic and mathematical
approaches for an automatic text categorization. It solves the problems of
conventional systems by combining an automatic content recognition
technique with a self-learning hierarchical scheme of indexed categories.
In response to a word submitted by a requestor, said system (100)
retrieves documents containing that word, analyzes the documents to
determine their word-pair patterns, matches the document patterns
to database patterns that are related to topics, and thereby assigns topics
to each document...

...assigned to more than one topic, a list of the document topics is
presented to the requestor, and the requestor designates the relevant
topics. The requestor is then granted access only to documents
assigned to relevant topics. A knowledge database (1408) linking search...

...In information retrieval (IR) systems with high-speed access, 
especially to search engines applied to the Internet and/or corporate 
intranet domains for retrieving accessible documents automatic text 
categorization. . . 



...of search query results within high-speed network environments. An
integrated, automatic and open information retrieval system
(100) comprises an hybrid method based on linguistic and
mathematical approaches for an automatic text categorization. It solves
the problems of conventional systems by combining an automatic content
recognition technique with a self-learning hierarchical scheme of indexed
categories. In response to a word submitted by a requester, said system
(100) retrieves documents containing that word, analyzes the
documents to determine their word-pair patterns, matches the document
patterns to database patterns that are related to topics, and
thereby assigns topics to each document. If the retrieved documents are
assigned to more than one topic, a list of the document topics is presented
to the requester, and the requester designates the relevant topics. The
requester is then granted access only to documents assigned to relevant
topics. A knowledge database (1408) linking search terms to
documents and documents to topics is established and maintained to speed
future...

...An integrated, automatic and open information retrieval system (100)
comprises an hybrid method based on linguistic and mathematical approaches
for an automatic text categorization. It solves the problems of
conventional systems by combining an automatic content recognition
technique with a self-learning hierarchical scheme of indexed categories.
In response to a word submitted by a requestor, said system (100)
retrieves documents containing that word, analyzes the documents to
determine their word-pair patterns, matches the document patterns to
database patterns that are related to topics, and thereby assigns
topics to each document. If the retrieved documents are assigned to
more than one topic, a list of the document topics is presented to the
requestor, and the requestor designates the relevant topics. The
requestor is then granted access only to documents assigned to relevant
topics. A knowledge database (1408) linking search terms to documents and
documents to topics is established and maintained to speed future
searches. Additionally, new strategies are presented to deal...

...of the combination of a conventional content recognition technique
and a hierarchical self-learning process of indexed categories.
In response to a word submitted by a requestor, the system (100)
retrieves documents containing that same word, analyzes the documents to
determine their word-pair structures, establishes a correspondence
between the document structures and...
Claims:

1. An interactive document retrieval system (100)
designed to search for documents after receiving a search query from a
requestor, said system comprising: a knowledge...

...least one data structure (202, 208, 210, 212, 214, 216 and/or 218)
that relates text patterns to topics, and a query processor (400) that,
in response to the receipt of a search query from a requester, performs the
following steps: searching for and trying to capture documents containing at
least one term related...

...the search query, if any documents are captured, analyzing the captured
documents to determine their text patterns, categorizing the captured
documents by comparing each document's text pattern to the text
patterns in the knowledge database (200), and if a document's text
pattern is similar to a text pattern in the knowledge database
(200), assigning to that document the similar word pattern's
related topic, presenting at least one list of the topics assigned to the
categorized documents to the requester, and asking the requester to
designate at least one topic from the list as a topic that is
relevant to the requestor's search, and granting the requestor access to the
subset of captured and categorized documents to which topics designated
by the requestor have been assigned, wherein the word patterns
determined by analysis are pairings of words, each pairing comprising
two searchable words with one word occurring frequently within the document
and the other word occurring near the one word frequently within the
document.
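
The categorizing and access-granting steps of this claim can be sketched as below. The claim only requires that a document's pattern be "similar" to a stored pattern; the Jaccard overlap and the 0.5 threshold used here are assumptions of the sketch, as are the data layout and the function names.

```python
def jaccard(a, b):
    """Overlap of two sets of word pairs: |A & B| / |A | B|."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if (a | b) else 0.0


def categorize(doc_patterns, knowledge_base, threshold=0.5):
    """Assign to each document the topics whose stored pattern is similar.

    doc_patterns: document id -> set of word pairs.
    knowledge_base: topic -> set of word pairs (hypothetical layout).
    """
    return {
        doc_id: sorted(topic for topic, stored in knowledge_base.items()
                       if jaccard(pattern, stored) >= threshold)
        for doc_id, pattern in doc_patterns.items()
    }


def grant_access(doc_topics, chosen_topics):
    """Return the documents assigned to at least one designated topic."""
    chosen = set(chosen_topics)
    return sorted(doc_id for doc_id, topics in doc_topics.items()
                  if chosen & set(topics))
```

In an interactive session, the distinct topics found by `categorize()` would be listed for the requestor, and `grant_access()` would be called with the topics the requestor designates.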



13/69, K/4 (Item 4 from file: 350) 

DIALOG (R) File 350:Derwent WPIX 

(c) 2007 The Thomson Corporation. All rts. reserv. 

0010791617 - Drawing available 

WPI ACC NO: 2001-407058/200143 
XRPX Acc No: N2001-301085 

Computer implemented document topic arrangement for information retrieval
system, involves displaying set of topics having semantic
correspondence with topic selected initially, along with preset
parameters
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Alerting Abstract US Bl 

NOVELTY - A set of documents (152) which satisfies the user's query, is 
received by user interface (110) such as keyboard and a topic related to 
user query is selected from received document. According to user 
selection of topic arrangements, the topics having semantic 
correspondence to selected topic are selected, arranged and displayed
along with preset parameters. 

DESCRIPTION - An INDEPENDENT CLAIM is also included for the information 
retrieval system . 

USE - Used for information retrieval system in information domains 
such as document management systems, library catalog, search engines for 
the world wide web, database, etc. 

ADVANTAGE - Improves understanding of organization, relationships and
nature of content in a document collection through distinct topic
arrangements, according to the queries of interest to the user, and allows
interactively constructing topic and key word based queries for further
navigating the document collection.

DESCRIPTION OF DRAWINGS - The figure shows the operation of the 
information retrieval system . 

110 User interface 

152 Documents 
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Computer implemented document topic arrangement for information retrieval 



system , involves displaying set of topics having semantic 
correspondence with topic selected initially, along with preset 
parameters 

Original Titles : 

Dynamic content organization in information retrieval systems . 

Alerting Abstract ...the user's query, is received by user interface
(110) such as keyboard and a topic related to user query is selected
from received document. According to user selection of topic
arrangements, the topics having semantic correspondence to selected
topic are selected, arranged and displayed along with preset parameters.

DESCRIPTION - An INDEPENDENT CLAIM is also included for the information
retrieval system.

...USE - Used for information retrieval system in information domains 
such as document management systems, library catalog, search engines for 
the world. . . 

...DESCRIPTION OF DRAWINGS - The figure shows the operation of the 
information retrieval system . 

Original Publication Data by Authority 



Original Abstracts: 

...plurality of topics. Each topic expresses an idea or concept, and is
associated with a set of terms which describe the topic, a set of
documents in the document collection which are about the topic. Each topic
also has topic-subtopic relationships with selected other topics,
forming local topic hierarchies. A query analysis module receives a
current query and processes the query against the document...

...module processes the document set according to defined parameters and a
user selection or automatic selection of a desired topic arrangement to
create various types of topic arrangements. These topic arrangements
include supertopics, subtopics, perspective...
Claims : 

...comprising: processing the query to select a set of documents satisfying
the query; receiving a selection of at least one topic derived from
the query; determining the supertopic arrangement as a combination of
supertopics that are associated with the documents of the document set and
with the selected topic and that optimally generalizes the document
set with respect to parameters; and displaying the supertopic
arrangement.
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Alerting Abstract US A 

NOVELTY - Point of view gists are generated for at least one document. A
query is processed, which has a query term identifying topics related to 
the query. Point of view gists are selected from one or more documents in 
response to the query, to generate a new research document. 

DESCRIPTION - Themes which define an overall content for the document are 
stored. The themes relevant to the query are selected as queries and 
documents that contain the themes are selected. An INDEPENDENT CLAIM is 
also included for the computer readable medium. 

USE - In multidocument search and retrieval system for books,
articles, periodicals.

ADVANTAGE - Emulates the paradigm of a researcher by extracting portions
of different documents to infer an answer to the search query. Utilizes
rich and comprehensive content processing system to accurately identify
themes that define the content of the source material.

DESCRIPTION OF DRAWINGS - The figure shows the search and retrieval
system.
Alerting Abstract ...USE - In multidocument search and retrieval
system for books, articles, periodicals...



...DESCRIPTION OF DRAWINGS - The figure shows the search and retrieval
system.

Original Publication Data by Authority

Original Abstracts: 

A research mode in a search and retrieval system generates a research
document that infers an answer to a query from multiple documents. The
search and retrieval system includes point of view gists for
documents to provide a synopsis for a corresponding document with a slant
toward a topic. To generate a research document, the search and retrieval
system processes a query to identify one or more topics related
to the query, selects document themes relevant to the query, and then
selects point of view gists, based on the document themes, that...

...slant towards the topics related to the query. A knowledge base, which
includes categories arranged hierarchically, is configured as a
directed graph to link those categories having a lexical, semantic or
usage association. Through use of the knowledge base, an expanded set of
query terms are generated, and research documents are compiled that
include point of view gists relevant to the expanded set of query terms.

A content processing system, which identifies the themes for a
document and classifies the document themes in categories of...
Claims:

A method for processing a query in a search and retrieval system, said
method comprising the steps of: generating a plurality of point of view
gists for at least...

...processing a query, which includes at least one query term, to identify
a plurality of topics related to said query; and selecting a plurality
of point of view gists from one or more documents to generate, in...

...research document, wherein said point of view gists selected comprise
synopses with slants toward said topics related to said query.
Alerting Abstract US A

NOVELTY - A character chain pattern containing code added data and
information corresponding to each pattern are generated by pattern and
information generators (15,17), respectively. A hierarchical division
code is added to each keyword output from keyword generator to produce a
keyword character chain pattern.

DESCRIPTION - A division code is added to each segment of data stored in
memory (11) according to their hierarchy level. A specific number is
allocated to each segment of data by an allocation unit (14). The usage
frequency of each data segment corresponding to the set hierarchical
division code is computed by a calculator (25). A series of keyword
character chain patterns is obtained by arranging the primary and
secondary patterns according to their division code and code added
keywords. During retrieval, a specific character chain data
corresponding to keyword chain pattern is collated by a collation unit
according to preferential order set by setting unit to extract a series of
particular character chain data. An index file representing character chain
data with respect to chain patterns is produced by a production unit. Based
on the input keyword, the keyword included in each character chain data is
compared and corresponding record is retrieved and displayed on a display
unit (23).
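
The indexing and collation steps in this abstract can be illustrated with a
small sketch. It is not the patented implementation: the "|" division code,
the function names, and the use of character positions in place of the
frequency-based chain information are assumptions made for illustration, and
a single division code stands in for the hierarchy-level codes.

```python
from collections import defaultdict

DIV = "|"  # hypothetical division code marking the ends of a data segment


def chain_patterns(text):
    """Return the adjacent-character pairs (chain patterns) of a string."""
    return [text[i:i + 2] for i in range(len(text) - 1)]


def build_index(records):
    """Map each chain pattern to the (record number, position) pairs where it occurs."""
    index = defaultdict(set)
    for rec_no, data in enumerate(records):
        coded = DIV + data + DIV  # code-added data
        for pos, pattern in enumerate(chain_patterns(coded)):
            index[pattern].add((rec_no, pos))
    return index


def search(index, keyword, whole=False):
    """Return the record numbers whose data contains the keyword.

    With whole=True the division code is added to both ends of the keyword,
    so only data segments equal to the keyword match.
    """
    coded = DIV + keyword + DIV if whole else keyword
    patterns = chain_patterns(coded)
    if not patterns:
        return set()
    # Start from every occurrence of the first pattern, then keep only the
    # chains whose following patterns occur at consecutive positions.
    chains = set(index.get(patterns[0], ()))
    for offset, pattern in enumerate(patterns[1:], start=1):
        hits = index.get(pattern, ())
        chains = {(r, p) for (r, p) in chains if (r, p + offset) in hits}
    return {r for r, _ in chains}
```

Pairs that include the division code play the role of the secondary
patterns: they let a lookup distinguish a keyword that is a whole data
segment from one that merely occurs inside a longer segment.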

USE - For RDBMS. 

ADVANTAGE - Each segment of character data is easily identified, by
judging the division code of each pattern accordingly. A particular
character chain data series corresponding to keyword patterns is generated
at high speed by using set division code and number data. The data
corresponding to keyword is also identified easily, during retrieval.
Retrieval speed is raised, by reducing number of character chain
patterns and character chain information in index file.

DESCRIPTION OF DRAWINGS - The figure shows the block diagram of
information retrieval support system.
11 Memory
14 Allocation unit
15,17 Pattern and information generators
23 Display unit
25 Calculator

Title Terms/Index Terms/Additional Words: INFORMATION; RETRIEVAL; SUPPORT;
SYSTEM



Class Codes 

International Classification (Main): G06F-017/30

US Classification, Issued: 707101000, 707100000, 707102000, 707001000,
707006000, 707007000

File Segment: EPI;
DWPI Class: T01

Manual Codes (EPI/S-X): T01-J05B1; T01-J05B3; T01-J05B4B; T01-J05B4M;
T01-J05B4P

Information retrieval support system for RDBMS 
Original Titles: 

Information retrieval system for retrieving a record of data 
including a keyword. 

Alerting Abstract ...NOVELTY - A character chain pattern containing
code added data and information corresponding to each pattern are generated
by pattern and information generators (15,17), respectively. A
hierarchical division code is added to each keyword output from keyword
generator to produce a keyword character chain pattern. ...code is
added to each segment of data stored in memory (11) according to their
hierarchy level. A specific number is allocated to each segment of data by
an allocation unit (14). The usage frequency of each data segment
corresponding to the set hierarchical division code is computed by a
calculator (25). A series of keyword character chain patterns is
obtained by arranging the primary and secondary patterns according to their
division code and code added keywords. During retrieval, a specific
character chain data corresponding to keyword chain pattern is collated
by a collation unit according to preferential order set by setting...
...Each segment of character data is easily identified, by judging the
division code of each pattern accordingly. A particular character chain
data series corresponding to keyword patterns is generated at high speed by
using set...

...keyword is also identified easily, during retrieval. Retrieval speed is
raised, by reducing number of character chain patterns and character
chain information in index file...

...DESCRIPTION OF DRAWINGS - The figure shows the block diagram of
information retrieval support system.



Original Publication Data by Authority 



Original Abstracts: 

...of adjacent data, so that a piece of code-added data is produced.
Thereafter, a character chain pattern (C1,C2) of each pair of
adjacent characters C1 and C2 in the code-added data is prepared, an
occurrence frequency Fi of...

...data H is calculated, and character chain information (F1,F2,H,L)
corresponding to each character chain pattern (C1,C2) is prepared.
Therefore, an index file in which pieces of character chain information
prepared for the same...

...ends of the keyword to produce a code-added keyword, and a series of
keyword character chain patterns is prepared from the code-added
keyword in the same manner. Thereafter, a series of particular character
chain information (F1,F2,H,L) corresponding to the series of keyword
character chain patterns on condition that F2 of first information
equals to F1 of second information following the first information and data

Claims:

An information retrieval system, comprising: data record storing
means for storing a plurality of data records, pieces of data respectively
composed...

...a top position of the piece of data and a current position of the
current character; character chain pattern preparing means for
preparing a first character chain pattern of each pair of
characters adjacent to each other in one code-added data produced by
the division code adding means for each code-added data and preparing a
second character chain pattern of each pair of one division code and
one character adjacent to the division code in one code-added data for
each code-added data; character chain...

...preparing a piece of character chain information corresponding to each
of the first and second character chain patterns prepared by the
character chain pattern preparing means according to the occurrence
frequencies obtained by the character occurrence frequency
calculating means and the data numbers and the record numbers allocated by
the number allocating means...

...one character and one division code of one piece of data corresponding
to one first or second character chain pattern, one data number of
the piece of data and one record number of the piece of data; index
file preparing means for preparing an index file for the particular...

...the character chain information preparing means are listed in
correspondence to the first and second character chain patterns
prepared by the character chain pattern preparing means; keyword
preparing means for preparing a keyword composed of a plurality of
characters; keyword division code adding means for adding the division
code to one end or both ends...

...prepared by the keyword preparing means to produce a piece of code-added
keyword; keyword character chain pattern preparing means for preparing
a first keyword character chain pattern of each pair of characters
adjacent to each other in the code-added keyword produced by the
keyword division code adding means, preparing a second keyword
character chain pattern of each pair of one division code and one
character adjacent to the division code in the code-added keyword, and
preparing a series of keyword character chain patterns by arranging
the first and second keyword character chain patterns in the order
of the characters and division codes in the code-added keyword; data
record retrieving means for extracting a series of particular
character chain information corresponding to the series of keyword
character chain patterns prepared by the keyword character chain
pattern preparing means from the index file for the particular item
prepared by the index file preparing means on condition that a
plurality of data numbers in the pieces of particular character chain
information agree with each other, a plurality...
Alerting Abstract EP A2

The character recognition device includes a word dictionary (a6) which
stores word identification information and hierarchy information for
layering words into a hierarchy and for recognising each of the words
within the hierarchy. A character transition probability table (a4)
stores at least probabilities of transitions from any one character to
another, and those pieces of the word identification information which
correspond to combinations of characters resulting from the
transitions.

A character transition probability table (a4) is used in optimising
candidate character strings obtained by the character recognition
device. The word dictionary is searched for words defined by those pieces
of the word identification information which correspond to the optimised
candidate string, thereby retrieving the searched words which are
identified by the applicable pieces of the hierarchy information and
which have yet to be input.

USE - Processing slips, invoices, forms etc. 

ADVANTAGE - Eliminates need for operator to write whole of character 
string by hand. Fast operation. Allows lower hierarchy level data to be 
selected without knowing higher levels. 
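
The two stages described above (optimising candidate strings with a
transition table, then retrieving the remaining hierarchy levels from the
word dictionary) can be sketched as follows. This is an illustration under
stated assumptions, not the claimed apparatus: the function names, the
sample place names, the probabilities and the floor value are hypothetical,
and the exhaustive scoring stands in for whatever optimisation the patent
actually uses.

```python
import itertools

# Hypothetical word dictionary: each entry is a path through the hierarchy,
# from the highest level down to the lowest.
DICTIONARY = [
    ("England", "Somerset", "Bath"),
    ("Italy", "Apulia", "Bari"),
]


def best_string(candidates, transition, floor=1e-6):
    """Choose one character per position so that the product of the
    character-to-character transition probabilities is largest.

    candidates: list of per-position candidate characters from a recogniser.
    transition: dict mapping (previous char, next char) to a probability.
    floor:      assumed probability for transitions missing from the table.
    """
    def score(chars):
        p = 1.0
        for a, b in zip(chars, chars[1:]):
            p *= transition.get((a, b), floor)
        return p

    return "".join(max(itertools.product(*candidates), key=score))


def complete(keyword, dictionary):
    """Return every hierarchy path that contains the keyword at any level."""
    return [path for path in dictionary if keyword in path]
```

For example, a recogniser that is unsure between "B"/"R", "a"/"o", "t"/"l"
and "h"/"b" can be resolved to "Bath" with a suitable table, and
`complete("Bath", DICTIONARY)` then supplies the higher levels that the
user never wrote.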
...dictionary which is searched for word defined by identification
information which corresponds to optimised candidate character string

Apparatus for recognizing input character strings by inference.


Alerting Abstract ...The character recognition device includes a word
dictionary (a6) which stores word identification information and hierarchy
information for layering words into a hierarchy and for recognising each
of the words within the hierarchy. A character transition probability
table (a4) stores at least probabilities of transitions from any one
character to another, and those pieces of the word identification
information which correspond to combinations of characters resulting
from the transitions.

...A character transition probability table (a4) is used in optimising
candidate character strings obtained by the character recognition
device. The word dictionary is searched for words defined by those pieces
of the...

...string, thereby retrieving the searched words which are identified by
the applicable pieces of the hierarchy information and which have yet to
be input...

...ADVANTAGE - Eliminates need for operator to write whole of character
string by hand. Fast operation. Allows lower hierarchy level data to be
selected without knowing higher levels.

Original Publication Data by Authority 
Original Abstracts: 

An object of the present invention is to provide a character recognition
apparatus for inferring the entire character string solely from a
user-input handwritten keyword and displaying the inferred result as a
candidate character string. The apparatus of the invention
comprises: a word dictionary a6 storing word identification information
and hierarchy information for layering a plurality of words into a
hierarchy and for recognizing each of the words within the hierarchy;
a character transition probability table a4 storing probabilities of
transitions from any one character to another, and those pieces of the word
identification information which correspond to combinations of
characters resulting from the transitions; and an optimization unit
for using the character transition probability table a4 in optimizing
candidate character strings obtained by a recognition unit. The
word dictionary a6 is searched for a word defined by the word
identification information which corresponds to the optimized candidate
character string, whereby the searched word is retrieved which
applies to the hierarchy information and which has yet to be input...

...of the present invention is to provide a character recognition apparatus
for inferring the entire character string solely from a user-input
handwritten keyword and displaying the inferred result as a candidate
character string. The apparatus of the invention comprises: a word
dictionary a6 storing word identification information and
hierarchy information for layering a plurality of words into a hierarchy
and for recognizing each of the words within the hierarchy; a character
transition probability table a4 storing probabilities of
transitions from any one character to another, and those pieces of the
word identification information which correspond to combinations of
characters resulting from the transitions; and an optimization unit for
using the character transition probability table a4 in optimizing
candidate character strings obtained by a recognition unit. The word
dictionary a6 is searched for a word defined by the word
identification information which corresponds to the optimized candidate
character string, whereby the searched word is retrieved which applies
to the hierarchy information and which has yet to be input.

...A character recognition apparatus for inferring the entire character
string solely from a user-input handwritten keyword and displaying the
inferred result as a candidate character string. The apparatus of the
invention comprises: a word dictionary storing word identification
information and hierarchy information for layering a plurality of words
into a hierarchy and for recognizing each of the words within the
hierarchy; a character transition probability table a4 storing
probabilities of transitions from any one character to another, and those
pieces of the word identification information which correspond to
combinations of characters resulting from the transitions; and an
optimization unit for using the character transition probability table in
optimizing candidate character strings obtained by a recognition
unit. The word dictionary is searched for a word defined by the word
identification information which corresponds to the optimized candidate
character string, whereby the searched word is retrieved which
applies to the hierarchy information and which has yet to be input.



...A character recognition apparatus for inferring the entire character
string solely from a user-input handwritten keyword and displaying the
inferred result as a candidate character string. The apparatus of the
invention comprises: a word dictionary storing word identification
information and hierarchy information for layering a plurality of words
into a hierarchy and for recognizing each of the words within the
hierarchy; a character transition probability table a4 storing
probabilities of transitions from any one character to another, and those
pieces of the word identification information which correspond to
combinations of characters resulting from the transitions; and an
optimization unit for using the character transition probability table in
optimizing candidate character strings obtained by a recognition unit. The
word dictionary is searched for a word defined by the word identification
information which corresponds to the optimized candidate character string,
whereby the searched word is retrieved which applies to the hierarchy
information and which has yet to be input.
Claims:

1. A character recognition apparatus having recognition means (a3) for
recognizing input character strings and display means (a8) for
displaying recognized results, said character recognition apparatus
comprising:
a word dictionary (a6) storing word identification
information and hierarchy information for layering a plurality of words
into a hierarchy and for recognizing each of said words within said
hierarchy;
a character transition probability table (a4) storing
at least probabilities of transitions from any one character to another,
and those pieces of said word identification information which correspond
to combinations of characters resulting from said transitions;
optimization means (a5) for using said character transition probability
table (a4) in optimizing candidate character strings obtained by said
recognition means (a3); and
retrieval means for searching
through said word dictionary (a6) for words defined by those pieces of said
word identification information which correspond to the optimized candidate
character string, thereby retrieving the searched words which are
identified by the applicable pieces of said hierarchy information and
which have yet to be input.



...the apparatus for recognizing characters comprises: a dictionary (a6) for
storing word identification information and hierarchy information
for arranging a plurality of words into a hierarchy and for
recognizing each of these words within the hierarchy, a
character transition probability table (a4) for storing
at least probabilities of transitions from any one character
to another and those pieces of the word identification information which
correspond to combinations of characters resulting from these
transitions...

...correspond to character strings, whereby the searched words are retrieved
which are identified by applicable pieces of the hierarchy information
and which have yet to be input.



...A character recognition apparatus having recognition means (a3) for
recognizing input character strings and display means (a8) for
displaying recognized results, said character recognition apparatus
comprising: a word dictionary (a6) storing word identification information
and hierarchy information for layering a plurality of words into a
hierarchy and for recognizing each of said words within said hierarchy, a
character transition probability table (a4) storing at least
probabilities of transitions from any one character to another, and those
pieces of said word identification information which correspond to
combinations of characters resulting from said transitions; optimization
means (a5) for using said character transition probability table (a4) in
optimizing candidate character strings obtained by said recognition
means (a3); and retrieval means for searching through said word
dictionary (a6) for words defined by those pieces of said word
identification information which correspond to the optimized candidate
character string, thereby retrieving the searched words which are
identified by the applicable pieces of said hierarchy information and
which have yet to be input.



...word dictionary (a6) storing word identification information
and hierarchy information for structuring a plurality of
words into a hierarchy and for recognizing each of said words within
said hierarchy; a character transition probability table
(a4) storing at least probabilities of transition from any one character
to another, and those parts of said word identification information
which correspond to combinations of
characters resulting from said transitions; optimization means (a5)
for using said character transition probability table...

...to retrieve the searched words which are identified by the
applicable parts of said hierarchy information and which are still to be
input...

...What is claimed is: 1. A character recognition apparatus having
recognition means for recognizing input character strings and display
means for displaying recognized results, said character recognition
apparatus comprising: a word dictionary storing word identification
information and hierarchy information for layering a plurality of words
into a hierarchy and for recognizing each of said words within said
hierarchy; a character transition probability table storing at least
probabilities of transitions from any one character to another, and those
pieces of said word identification information which correspond to
combinations of characters resulting from said transitions; optimization
means for using said character transition probability table in optimizing
candidate character strings obtained by said recognition means;
and retrieval means for searching through said word dictionary for words
defined by those pieces of said word identification information which
correspond to the optimized candidate character string, thereby
retrieving the searched words which are identified by the applicable
pieces of said hierarchy information and which have yet to be input.



...A character recognition apparatus having recognition means for
recognizing input character strings and display means for displaying
recognized results, said character recognition apparatus comprising: a word
dictionary storing word identification information and hierarchy
information for layering a plurality of words into a hierarchy and for
recognizing each of said words within said hierarchy; a character transition
probability table storing at least probabilities of transitions from any
one character to another, and those pieces of said word identification
information which correspond to combinations of characters resulting from
said transitions; optimization means for using said character transition
probability table in optimizing candidate character strings obtained by
said recognition means; and retrieval means for searching through said word
dictionary for words defined by those pieces of said word identification
information which correspond to the optimized candidate character string,
thereby retrieving the searched words which are identified by the
applicable pieces of said hierarchy information and which have yet to be
input.



...What is claimed is: 1. A search system for searching a multi-item
data base, said system comprising: said multi-item data base; a
search-object specification table for specifying items as search objects; an
attribute definition table for specifying...

...a display step of a search result and an indicator indicating how much
said search result must match a keyword or a minimum number of matching
searched items in said search result before said search
Alerting Abstract US A

The document-creating support apparatus includes an
input device for inputting a character string to be proofread, and an
editing device functionally connected to the input device for producing an
edited sentence. A proofread information accumulating device functionally
connected to the editing device includes a memory having retrievable
information stored therein according to a multi-level hierarchical
classification. The retrievable information is stored in records which are
linked in the memory by arcs. Each arc has both a pointer to a memory
location, an indication of an information type stored at the memory
location pointed to by the pointer, and an indication of a level in the
multi-level hierarchical classification.

A retrieving device is functionally connected to the input device and the
proofread information accumulating device, for specifying a keyword to
be proofread in the character string, for retrieving retrievable
information related to the keyword from the information accumulating device
in accordance with an information type. The proofreading device is
functionally connected to the retrieving device for selecting retrievable
information to replace the keyword.
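
The record-and-arc layout described above can be pictured with a short
sketch. The class names, field names and sample entries are hypothetical,
and a list index stands in for the memory pointer; only the three pieces of
information carried by each arc (pointer, information type, level) come from
the abstract.

```python
from dataclasses import dataclass, field


@dataclass
class Arc:
    target: int      # pointer: index of the linked record in memory
    info_type: str   # type of information stored at the target, e.g. "synonym"
    level: int       # level in the multi-level hierarchical classification


@dataclass
class Record:
    text: str
    arcs: list = field(default_factory=list)


def retrieve(memory, keyword, info_type):
    """Find the record holding the keyword and follow its arcs of one type."""
    for record in memory:
        if record.text == keyword:
            return [memory[arc.target].text
                    for arc in record.arcs
                    if arc.info_type == info_type]
    return []


# Hypothetical memory contents: record 0 is the keyword, linked to a synonym
# and to an example usage.
MEMORY = [
    Record("big", [Arc(1, "synonym", 1), Arc(2, "usage", 1)]),
    Record("large"),
    Record("a big house"),
]
```

Calling `retrieve(MEMORY, "big", "synonym")` follows only the arcs tagged
with that information type, which is how a proofreading step could offer
replacement candidates for a keyword.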

USE/ADVANTAGE - Implements document proofreading and creating functions 
by utilising information applied to arcs which link correlated retrieving 
information. Enhanced proofreading accuracy and efficiency. 
Title Terms/Index Terms/Additional Words: DOCUMENT; SUPPORT; APPARATUS; 
Original Titles:

Document processing support system using keywords to retrieve 
explanatory information linked together by correlative arcs 

Alerting Abstract ...input device for inputting a character string to
be proofread, and an editing device functionally connected to the input
device for producing...

...editing device includes a memory having retrievable information stored
therein according to a multi-level hierarchical classification. The
retrievable information is stored in records which are linked in the memory
by...

...pointed to by the pointer, and an indication of a level in the
multi-level hierarchical classification...

...device is functionally connected to the input device and the proofread
information accumulating device, for specifying a keyword to be
proofread in the character string, for retrieving retrievable
information related to the keyword from the information accumulating device
in accordance with an information type. The proofreading device is
functionally connected to the retrieving device for selecting retrievable
information to replace the keyword.

Title Terms/Index Terms/Additional Words: HIERARCHY;
Original Publication Data by Authority 

Original Abstracts: 

...editing input information. Input information is received from an input
means (1) from which a keyword is selected for editing purposes. A
low level record is located in a memory (5) which has the keyword of the
input information stored therein. The low level record having the keyword
stored therein is accessed to determine the location in the memory of
higher level records and information types of the higher level records. The
higher...

...keyword, a synonym for the keyword, an example usage of the keyword,
etc. A display (55) is provided with an indication of the information
types stored in the memory for the keyword so that a user can select
a desired information type. The explanatory information associated with
the desired information type is then displayed to the user...
Claims:

A document-creating support apparatus, comprising: input means for
inputting a character string to be proofread; editing means
functionally connected to said input means for producing an edited
sentence; proofread...

...means and including a memory having retrievable information stored
therein according to a multi-level hierarchical classification, said
retrievable information being stored in records which are linked in the
memory by arcs, each...

...pointed to by the pointer, and an indication of a level in said
multi-level hierarchical classification; retrieving means functionally
connected to said input means and said proofread information accumulating
device for receiving said character string to proofread, for
specifying a keyword to be proofread in said character string,
for retrieving retrievable information related to said keyword from said
proofread information accumulating device in accordance with an information
type; and proofreading means functionally connected to said retrieving
means for selecting retrievable information to replace the keyword,
Alerting Abstract WO A

The information retrieval method comprises the steps of defining as a
'criterion key' that keyword which, among all the keywords associated with
any of the texts in the first group of texts, is associated with the
largest number of texts within the first group. The first group is
separated into 2 sub-groups, the first sub-group of texts having the
criterion key as one of its keywords and the second sub-group not including
the criterion key.

Results obtained from the above steps are then displayed. The above
process is applied recursively to at least one of the two sub-groups.
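
The recursive separation by criterion key can be sketched in a few lines.
The function name, the nested-dictionary result, the alphabetical
tie-breaking rule and the `min_size` stopping condition are assumptions
added for illustration; the abstract itself only fixes the choice of the
most frequent keyword and the with/without split.

```python
from collections import Counter


def split_by_criterion_key(texts, used=frozenset(), min_size=2):
    """Recursively split a group of texts on its most frequent keyword.

    texts: dict mapping a text identifier to its set of keywords.
    used:  criterion keys already applied higher up in this branch.
    Returns a sorted list of identifiers for a leaf, otherwise a dict with
    the criterion key and the two sub-groups.
    """
    counts = Counter(k for kws in texts.values() for k in kws - used)
    if len(texts) < min_size or not counts:
        return sorted(texts)
    # Criterion key: the keyword associated with the largest number of texts
    # (ties broken alphabetically so the result is deterministic).
    key = min(counts, key=lambda k: (-counts[k], k))
    with_key = {t: kws for t, kws in texts.items() if key in kws}
    without_key = {t: kws for t, kws in texts.items() if key not in kws}
    return {
        "key": key,
        "with": split_by_criterion_key(with_key, used | {key}, min_size),
        "without": split_by_criterion_key(without_key, used, min_size),
    }
```

Applied to four texts tagged {bank, river}, {bank, money}, {bank, money,
loan} and {tree}, the first criterion key is "bank"; the three texts that
carry it are then split again on "money", while the remaining text forms a
leaf of its own.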

ADVANTAGE - Distinguishes between text areas having same words but
different meanings.

Equivalent Alerting Abstract US A 

The method uses a processor and associated memory to make explicit the
relationships among texts in a text base stored in the memory. The
relationships are other than those provided by a user. Each text in the
text base of texts is associated with keywords. The method involves the
processor accepting from the user a search request of a search to be
performed to locate a first group of texts. The processor performs the
search request described by the user among the keywords associated with the
texts in the text base to locate the first group of texts with
associated keywords matching the search request. For each of the keywords
associated with the texts in the first group, the processor counts the
number of texts associated with each of the keywords. The processor
compares the number of texts the sub-group not separated into further
sub-groups on the display medium.
(28pp)

Equivalent Alerting Abstract US A 

The computerised information retrieval system is formed of a text 
base of texts of variable length and content. The texts are selected from 
the text base on the basis of Boolean logic searches among keywords 
associated with the texts. When a group is retrieved from such a search, 
the system automatically segregates the texts based on the presence or 
absence of a criterion key keyword selected so as to segregate the 
texts into sub-groups. The same criterion key analysis can then be applied 
recursively to the sub-groups. The resulting sub-groups are then displayed 
to the user in a hierarchical display to illustrate the relationships among 
the texts. A string comparison routine is also disclosed to search for 
similar keywords. 
(28pp) 
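
The record does not say how the string comparison routine works. As a rough stand-in, the sketch below ranks keywords by the similarity ratio of Python's standard `difflib.SequenceMatcher`. The function names and the 0.85 threshold are assumptions chosen for illustration only.

```python
from difflib import SequenceMatcher

def similarity(a, b):
    """Similarity in [0, 1] between two strings (stand-in measure, case-insensitive)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def similar_keywords(query, keywords, threshold=0.85):
    """Return keywords at or above the threshold, most similar first."""
    scored = [(similarity(query, k), k) for k in keywords]
    ranked = sorted(scored, key=lambda pair: (-pair[0], pair[1]))
    return [k for score, k in ranked if score >= threshold]

# A misspelled query still finds the intended keyword.
matches = similar_keywords("retrival", ["display", "retrieval"])
```

The same score could be used to order a list of found texts by how closely their keywords match the user's search description.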
keywords associated with any of the texts in the first group of texts, 
is associated with the largest number of texts within the first group. 
The first group is separated into 2 sub-groups, the first sub-group of 
texts having the criterion key as one of its keywords and the second 
sub-group not ... 

...Results obtained from the above steps are then displayed. The above 
process is applied recursively to at least one of the two sub-groups... 

Equivalent Alerting Abstract ...the user a search request of a search to 
be performed to locate a first group of texts. The processor performs 
the search request described by the user among the keywords associated with 
the texts in the text base to locate the first group of texts with 
associated keywords matching the search request. For each of the keywords 
associated with the texts in the first group, the processor counts the 
number of texts associated with each of the keywords. The processor 
compares the number of texts the sub-group not separated into further 
sub-groups on the display medium... 

...The computerised information retrieval system is formed of a text 
base of texts of variable length and content. The texts... 

...the text base on the basis of Boolean logic searches among keywords 
associated with the texts. When a group is retrieved from such a search, 
the system automatically segregates the texts based on the presence or 
absence of a criterion key keyword selected so as to segregate the 
texts into sub-groups. The same criterion key analysis can then be applied 
recursively to the sub-groups. The resulting sub-groups are then displayed 
to the user in a hierarchical display to illustrate the relationships among 
the texts. A string comparison routine is also disclosed to search for 
similar keywords... 

Original Publication Data by Authority 



Original Abstracts: 

A computerized information retrieval system provides a break-down of 
the major and minor subject areas covered by a group of texts 
associated with a set of descriptive keywords. The system makes explicit 
the underlying informational structure of the group as a whole, by 
organizing the texts into sub-groups defined by the keywords which the 
texts of each sub-group have in common. The process being recursive, 
the sub-groups of each sub-group can be found to any desired depth. An 
additional method of analysis provides a measure of the degree of 
similarity between words or between collections of words such as 
sentences. The method provides a facility for searching for text whose 
keywords are ... 

...user's search request and a facility for selecting texts from a textbase 
and/or ordering the lists of texts found according to the degree of 
similarity between the user's search description and textual... 

...from the textbase on the basis of Boolean logic searches among keywords 
associated with the texts. When a group is retrieved from such a 
search, the system automatically segregates the texts based on the presence 
or absence of a criterion key keyword selected so as to segregate the 
texts into sub-groups. The same criterion key analysis can then be 
applied recursively to the sub-groups. The resulting sub-groups are then 
displayed to the user in a hierarchical display to illustrate the 
relationships among the texts. A string comparison routine is also 
disclosed to search for similar keywords... 

...A computerized information retrieval system is formed of a textbase 
of texts of variable length and content. The texts are... 

...from the textbase on the basis of Boolean logic searches among keywords 
associated with the texts. When a group is retrieved from such a 
search, the system automatically segregates the texts based on the presence 
or absence of a criterion key keyword selected so as to segregate the 
texts into sub-groups. The same criterion key analysis can then be 
applied recursively to the sub-groups. The resulting sub-groups 
are then displayed to the user in a hierarchical display to illustrate 
the relationships among the texts. A string comparison routine is also 
disclosed to search for similar keywords... 

ABSTRACT 



PROBLEM TO BE SOLVED: To extract a required classification by realizing the 
AND retrieval of different words in high-order and low-order 
classifications even in the case of retrieval with plural words and to 
extract the classification of a designated rank by designating and 
retrieving the rank of a classification hierarchy with plural words. 

SOLUTION: This method comprises a process (a) for inputting plural 
keywords, a process (b) for retrieving the titles of respective 
classifications with one of the inputted keywords, a process (c) for 
preparing a direct high-order classification group including high-order 
classifications for each of all the classifications of keywords retrieved 
in the process (b), a process (d) for retrieving the titles of 
respective classifications in the direct high-order classification group 
prepared in the process (c) with all the other keywords inputted in the 
process (a), a process (e) for extracting the direct high-order 
classification group, which includes all the other keywords in the process 
(d), and defining it as a retrieved answer corresponding to the keyword 
used in the process (b), a process (f) for finding a retrieved answer 
corresponding to each of the keywords by repeating the processes (b)-(e) 
concerning the keywords except for the keyword selected in the 
process (b), and a process (g) for outputting the retrieved answers 
prepared in the processes (e) and (f). 
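
Processes (a) to (g) can be sketched roughly as follows. This is a minimal illustration under stated assumptions: the classification table `CLASSES` and all function names are hypothetical, titles are matched by case-insensitive substring, and the "direct high-order classification group" is taken to mean the matched classification together with its chain of ancestors. Retrieval by designated rank is not covered.

```python
# Hypothetical classification table: id -> (title, parent id or None).
CLASSES = {
    1: ("Electronics", None),
    2: ("Audio equipment", 1),
    3: ("Speaker cabinets", 2),
    4: ("Furniture", None),
    5: ("Kitchen cabinets", 4),
}

def lineage(cid, classes):
    """Return cid followed by its higher-order classifications up to the root."""
    chain = []
    while cid is not None:
        chain.append(cid)
        cid = classes[cid][1]
    return chain

def has_word(word, cid, classes):
    """True if the classification title contains the word (case-insensitive)."""
    return word.lower() in classes[cid][0].lower()

def retrieve_classifications(keywords, classes):
    """AND retrieval across hierarchy levels; returns matching classification ids."""
    answers = set()
    for kw in keywords:                                  # (b), repeated as in (f)
        others = [k for k in keywords if k != kw]
        for cid in classes:
            if not has_word(kw, cid, classes):
                continue
            group = lineage(cid, classes)                # (c)
            # (d)-(e): every other keyword must appear in some title of the group
            if all(any(has_word(o, g, classes) for g in group) for o in others):
                answers.add(cid)
    return sorted(answers)                               # (g)

result = retrieve_classifications(["cabinets", "audio"], CLASSES)
```

In this example only "Speaker cabinets" is returned, because "audio" appears in one of its higher-order classifications, while "Kitchen cabinets" has no such ancestor.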
ABSTRACT 

PURPOSE: To ensure the integrated registration of key words, the sure 
retrieval of documents, and the quick comparison by selecting the key 
word items turned into a hierarchical form by an interactive process 
and designating the key words in combinations. 

CONSTITUTION: The most significant hierarchy of a group of key word 
items stored in a storage part is displayed and the item group of the next 
higher hierarchy related to one item if selected by an input device 5 is 
displayed. Such an interactive process is carried out down to the item 
group of the lowest hierarchy. A key word code (K.C.) consisting of the 
item numbers of all selected hierarchies is supplied to a retrieving 
part. Then an input K.C. is stored in a key word storing part 9 
corresponding to the document retrieving information designated by a 
control part 2 in a registration mode. In a document retrieving mode the 
document retrieving information coincident with the input K.C. is read out 
of the part 9 and given to the part 2. Then the document retrieving 
information is designated at a retrieving part 10 in case of registration 
and the corresponding document is detected out of a document storing part 1 
based on the document retrieving information in case of the document 
retrieval. 
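
The key word code scheme can be sketched as below. This is an illustrative sketch only: the `HIERARCHY` table, the `keyword_code` function, and the `KeywordStore` class are hypothetical names, and an in-memory dictionary stands in for the key word storing part.

```python
# Hypothetical key word hierarchy: item number -> (label, sub-items).
HIERARCHY = {
    1: ("Machinery", {
        1: ("Engines", {1: ("Diesel", {}), 2: ("Gasoline", {})}),
        2: ("Pumps", {}),
    }),
    2: ("Chemistry", {1: ("Polymers", {})}),
}

def keyword_code(selections, hierarchy=HIERARCHY):
    """Walk the hierarchy with one item number per level and build the code."""
    level, labels = hierarchy, []
    for number in selections:
        label, level = level[number]   # KeyError if the item is not offered
        labels.append(label)
    if level:
        raise ValueError("selection must continue down to the lowest hierarchy")
    return tuple(selections), labels

class KeywordStore:
    """Maps key word codes to document retrieving information."""

    def __init__(self):
        self._by_code = {}

    def register(self, code, doc_ref):
        """Registration mode: store the document reference under the code."""
        self._by_code.setdefault(code, []).append(doc_ref)

    def retrieve(self, code):
        """Retrieving mode: return references whose code coincides with the input."""
        return list(self._by_code.get(code, []))

code, labels = keyword_code([1, 1, 2])
store = KeywordStore()
store.register(code, "doc-17")
```

Because every document is registered through the same menu of items, two users who pick the same path always produce the same code, which is what makes the registration of key words consistent.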
ABSTRACT 

PURPOSE: To facilitate operation by detecting the item, which most 
resembles a keyword segmented from an input character string, in the 
range from a noticed item to items a prescribed level lower than this item 
out of items hierarchically provided in the data base to retrieve a 
desired item. 

CONSTITUTION: A reading part 4, a resemblance degree calculating part 5, 
and an item specifying part 6 are provided. The reading part 4 successively 
reads out all items in the range from a noticed item to items in the 
hierarchy a prescribed level lower than this item in the data base as 
candidates corresponding to the keyword segmented from the input character 
string, and the resemblance degree calculating part 5 calculates the 
degree of resemblance between each item read by the reading part 4 and the 
keyword, and the item specifying part 6 specifies an item based on the 
degrees of resemblance calculated by the resemblance degree calculating 
part 5. The item specified by the item specifying part 6 is noticed, and 
all items in the range from this item to items in the hierarchy the 
prescribed level lower than this item are read out, and degrees of 
resemblance between these items and the next keyword are calculated to 
specify an item, and this operation is repeatedly executed to retrieve a 
desired item. Thus, the operation for item selection is facilitated. 
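
The repeated read, score, and specify cycle can be sketched as follows. This is a minimal illustration: the item tree `TREE` and the function names are hypothetical, and `difflib.SequenceMatcher` is an assumed stand-in for the resemblance degree calculation, which the abstract does not define.

```python
from difflib import SequenceMatcher

# Hypothetical item hierarchy: item -> list of items one level lower.
TREE = {
    "root": ["sports", "music"],
    "sports": ["baseball", "football"],
    "music": ["jazz", "classical"],
    "baseball": ["pitching", "batting"],
    "football": [], "jazz": [], "classical": [], "pitching": [], "batting": [],
}

def resemblance(item, keyword):
    """Degree of resemblance in [0, 1] (stand-in measure)."""
    return SequenceMatcher(None, item.lower(), keyword.lower()).ratio()

def candidates(noticed, tree, levels):
    """Items from the noticed item down to `levels` levels below it."""
    found, frontier = [noticed], [noticed]
    for _ in range(levels):
        frontier = [child for item in frontier for child in tree[item]]
        found.extend(frontier)
    return found

def retrieve_item(keywords, tree, start="root", levels=2):
    """For each keyword in turn, notice the most resembling candidate item."""
    noticed = start
    for kw in keywords:
        noticed = max(candidates(noticed, tree, levels),
                      key=lambda item: resemblance(item, kw))
    return noticed

found = retrieve_item(["basebal", "bating"], TREE)
```

Limiting each step to a prescribed number of levels keeps the candidate list small, and the similarity score tolerates keywords that do not match an item name exactly.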



